Document Listing for Queries with Excluded Pattern
Authors
Abstract
Let D = {d_1, d_2, ..., d_D} be a given collection of D string documents of total length n. We consider the problem of indexing D such that, whenever two patterns P and P− come as an online query, we can list all those documents that contain P but not P−. Let t denote the number of such documents. An index proposed by Fischer et al. (LATIN, 2012) can answer this query in O(|P| + |P−| + t + √n) time. However, its space requirement is O(n^{3/2}) bits. We propose the first linear-space index for this problem, with a worst-case query time of O(|P| + |P−| + √n log log n + √(nt) log^{2.5} n).
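To make the query semantics concrete, the following is a minimal sketch of a naive baseline that scans every document for each query. It is not the index described in the abstract and it does not achieve any of the stated bounds; the function name `list_docs_with_exclusion` and the sample documents are hypothetical, chosen only for illustration.

```python
from typing import List


def list_docs_with_exclusion(docs: List[str], p: str, p_minus: str) -> List[int]:
    """Return the indices of documents that contain p but not p_minus.

    Naive baseline: every document is scanned on each query, so the cost
    grows with the total collection length n rather than with the output
    size t. Illustrative only; not the index from the abstract.
    """
    return [
        i
        for i, doc in enumerate(docs)
        if p in doc and p_minus not in doc
    ]


# Hypothetical collection with D = 5 documents.
docs = ["banana", "ananas", "cabana", "anagram", "bread"]

# Documents containing "ana" but not "ban".
print(list_docs_with_exclusion(docs, "ana", "ban"))  # prints [1, 3], so t = 2
```

The point of an index is to avoid this full scan: the collection is preprocessed once so that each query costs time depending on |P|, |P−|, t, and a sublinear function of n.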
Similar resources
Succinct data structures for flexible text retrieval systems
We propose succinct data structures for text retrieval systems supporting document listing queries and ranking queries based on the tf*idf (term frequency times inverse document frequency) scores of documents. Traditional data structures for these problems support queries only for some predetermined keywords. Recently Muthukrishnan proposed a data structure for document listing queries for arbi...
Colored Range Queries and Document Retrieval
Colored range queries are a well-studied topic in computational geometry and database research that, in the past decade, have found exciting applications in information retrieval. In this paper we give improved time and space bounds for three important one-dimensional colored range queries — colored range listing, colored range top-k queries and colored range counting — and, thus, new bounds fo...
Apply Uncertainty in Document-Oriented Database (MongoDB) Using F-XML
With the move to the big-data world, where unstructured data is growing at high velocity, there is a need for a data store that can hold this volume of data. Traditionally, relational databases have been used, but they are not well suited to handling such large amounts of data, so a move to non-relational data stores is needed. In the current study, we have proposed an extension of the Mongo...
A General Document Retrieval in Compact Space
Given a collection of documents and a query pattern, document retrieval is the problem of obtaining documents that are relevant to the query. The collection is available beforehand so that a data structure, called an index, can be built on it to speed up queries. While initially restricted to natural language text collections, document retrieval problems arise nowadays in applications like bioi...